Word-Length Distribution in Inuktitut Narratives: Empirical and Theoretical Findings

Author

  • Peter Meyer
Abstract

This paper deals with the distribution of word length in short native mythological and historical Eskimo narrative texts. To my knowledge, no Eskimo-Aleut data have been the object of quantitative linguistic investigation so far. Due to the strong linguistic and stylistic homogeneity of the examined texts, it was assumed that these texts can be subsumed under a single law of word length distribution, if word length distribution of a text is considered as a function of certain of its properties, such as author, language, and genre. So far, word length distribution in texts of a wide variety of languages and genres has been demonstrated to follow distributions of the compound Poisson family of discrete probability distributions. In view of the morphological idiosyncrasies of the Eskimo language in general, which are responsible for an unusually high mean word length of about 4.5 to 5.2 syllables per word in the texts, it is interesting to see whether Eskimo texts show a significantly different behaviour with respect to word length. The results demonstrate that the Eskimo data employed in this study can be fitted well by the Hyperpoisson distribution. Two further discrete probability distributions will be deduced from certain morphology-based assumptions about Eskimo. It turns out that most of the Eskimo data can be fitted by these two distributions. The question to what extent these results point to a more grammar-oriented theory of word length is also discussed.
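As a rough illustration of the distribution named in the abstract, the sketch below evaluates the 1-displaced Hyperpoisson probabilities in the form commonly used in word-length studies, P(x) = a^(x-1) / (b^(x-1) · 1F1(1; b; a)) for x = 1, 2, ..., where b^(k) denotes the rising factorial. The function name `hyperpoisson_pmf` and the parameter values a = 4.0 and b = 1.0 are assumptions chosen only for demonstration; they are not taken from the paper, whose fitted parameters are not given in this abstract.

```python
def hyperpoisson_pmf(a, b, max_len, terms=200):
    """Return 1-displaced Hyperpoisson probabilities for word lengths 1..max_len.

    P(x) = a^(x-1) / (b^(x-1) * 1F1(1; b; a)), where b^(k) is the rising
    factorial b(b+1)...(b+k-1). The normalising constant 1F1(1; b; a) is
    approximated by a truncated series, so `terms` must be large enough
    for the tail to be negligible.
    """
    terms = max(terms, max_len)
    # Build t_k = a^k / b^(k) iteratively: t_0 = 1, t_k = t_{k-1} * a / (b + k - 1).
    series = [1.0]
    for k in range(1, terms):
        series.append(series[-1] * a / (b + k - 1))
    norm = sum(series)  # truncated 1F1(1; b; a)
    return [series[x - 1] / norm for x in range(1, max_len + 1)]


# Hypothetical parameters, NOT fitted values from the paper.
probs = hyperpoisson_pmf(a=4.0, b=1.0, max_len=60)
mean_len = sum(x * p for x, p in enumerate(probs, start=1))
```

With b = 1 the distribution reduces to a Poisson distribution shifted by one, so its mean is 1 + a. The illustrative choice a = 4.0 therefore gives a mean of 5 syllables per word, inside the 4.5 to 5.2 range the abstract reports. Fitting real data would mean estimating a and b from observed syllable counts and checking goodness of fit, which this sketch does not attempt.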

Similar resources

The acquisition of ergativity in Inuktitut*

One potential challenge for children learning Inuktitut comes from the ergative case marking system, because of the contrast between the ergative system in morphology and the accusative system governing syntax. However, no studies have yet been published focusing on how Inuktitut-speaking children acquire ergativity. In this chapter, we investigate this process using naturalistic spontaneous sp...

Aligning and Using an English-Inuktitut Parallel Corpus

A parallel corpus of texts in English and in Inuktitut, an Inuit language, is presented. These texts are from the Nunavut Hansards. The parallel texts are processed in two phases, the sentence alignment phase and the word correspondence phase. Our sentence alignment technique achieves a precision of 91.4% and a recall of 92.3%. Our word correspondence technique is aimed at providing the broades...

Models for Inuktitut-English Word Alignment

This paper presents a set of techniques for bitext word alignment, optimized for a language pair with the characteristics of Inuktitut-English. The resulting systems exploit cross-lingual affinities at the sublexical level of syllables and substrings, as well as regular patterns of transliteration and the tendency towards monotonicity of alignment. Our most successful systems were based on clas...

NUKTI: English-Inuktitut Word Alignment System Description

Machine Translation (MT) as well as other bilingual applications strongly rely on word alignment. Efficient alignment techniques have been proposed but are mainly evaluated on pairs of languages where the notion of word is mostly clear. We concentrated our effort on the English-Inuktitut word alignment shared task and report on two approaches we implemented and a combination of both.

Evaluating a Morphological Analyser of Inuktitut

We evaluate the performance of a morphological analyser for Inuktitut across a medium-sized corpus, where it produces a useful analysis for two out of every three types. We then compare its segmentation to that of simpler approaches to morphology, and use these as a pre-processing step to a word alignment task. Our observations show that the richer approaches provide little as compared to simpl...

Journal:
  • Journal of Quantitative Linguistics

Volume 4, Issue

Pages -

Publication date: 1997